feat(codeceptq): CLI to query HTML with CodeceptJS locators by DavertMik · Pull Request #5550 · codeceptjs/CodeceptJS

DavertMik · 2026-05-04T22:14:20Z

Summary

Adds codeceptq — a standalone CLI that takes an HTML stream (stdin or --file) plus a CodeceptJS locator (CSS / XPath / fuzzy / semantic) and prints matched elements with line numbers and outerHTML snippets.

Designed for AI agents iterating on locators against aiTrace's per-step *_page.html snapshots: "would this locator match at step N?" — answered in milliseconds, no browser, no re-run.

# CSS / XPath
cat output/trace_*/0007_*_page.html | npx codeceptq './/input[@required]'
npx codeceptq '#submit' --file output/trace_*/0007_*_page.html

# semantic locators (label / button text / option / checkbox)
npx codeceptq 'Email' --field --file output/trace_*/0003_*_page.html
npx codeceptq 'Save' '.modal' --click --file output/trace_*/0005_*_page.html

# JSON for scripting
npx codeceptq 'Username' --field --json --file output/trace_*/0002_*_page.html

Changes

New bin/codeceptq.js (registered as codeceptq in package.json#bin).
New lib/command/query.js — parse5-tracked line numbers + xmldom xpath eval. Reuses Locator for CSS→XPath and semantic builders (Locator.field.byText, Locator.clickable.wide, Locator.checkable.byText, Locator.select.byVisibleText).
lib/html.js#formatHtml now passes inline: [] to js-beautify so every element in trace HTML lands on its own line — line numbers from codeceptq map 1:1 to elements.
xpath@0.0.34 promoted from devDependencies → dependencies (already in tree).
Default --snippet length 500 chars; --full for complete outerHTML; --json for tooling.
Exit codes: 0 match, 1 no match, 2 invalid input/XPath.
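
The exit-code contract and the snippet defaults can be sketched roughly as below. This is a minimal illustration, not the code in lib/command/query.js: the names `EXIT`, `exitCodeFor`, and `toSnippet` are hypothetical, and cutting the snippet with a plain slice (no ellipsis) is an assumption. Only the values 0/1/2 and the 500-character default come from the list above.

```javascript
// Hypothetical names; only the numeric codes and the 500-char default are from the PR.
const EXIT = { MATCH: 0, NO_MATCH: 1, INVALID: 2 };

// Map a query outcome to the documented exit code.
function exitCodeFor(matches, error) {
  if (error) return EXIT.INVALID; // invalid input or XPath
  return matches.length > 0 ? EXIT.MATCH : EXIT.NO_MATCH;
}

// Shorten outerHTML unless --full was requested.
// Assumption: a plain cut at `snippet` characters, with no ellipsis appended.
function toSnippet(outerHTML, { snippet = 500, full = false } = {}) {
  if (full || outerHTML.length <= snippet) return outerHTML;
  return outerHTML.slice(0, snippet);
}
```

A calling script could then branch on the process exit status: 0 means the locator matched, 1 means it did not, and 2 means the locator or input itself needs fixing.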

Tests

test/runner/codeceptq_test.js — 45 tests against test/data/{checkout,github,gitlab,app/drag_drop}.html. Each assertion shows the expected { line, snippet } inline so the test source is also a behavior spec:

expect(parsed.matches).toEqual([
  { line: 87, snippet: '<input type="text" class="form-control" id="firstName" placeholder="" value="" required>' },
  { line: 94, snippet: '<input type="text" class="form-control" id="lastName" placeholder="" value="" required>' },
  // ...
])
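
For tooling that consumes `--json`, a small sketch of reading the output follows. It assumes the top-level JSON object carries a `matches` array of `{ line, snippet }` entries, as the `parsed.matches` assertion above suggests; `summarizeMatches` is a hypothetical helper, not part of codeceptq.

```javascript
// Hypothetical helper: turn codeceptq --json output into "line: snippet" strings.
// Assumption: the JSON has the shape { matches: [{ line, snippet }, ...] }.
function summarizeMatches(jsonText) {
  const parsed = JSON.parse(jsonText);
  const matches = Array.isArray(parsed.matches) ? parsed.matches : [];
  return matches.map((m) => `${m.line}: ${m.snippet}`);
}
```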

Coverage: XPath, CSS (id/class/attr/forced), --field, --click/--clickable, --checkable, --select, fuzzy auto-detect, context scoping, stdin vs --file, --limit, --snippet, --full, --json, exit codes, large fixtures.

Test plan

npx mocha test/runner/codeceptq_test.js → 45 passing
npx mocha test/unit/html_test.js test/unit/utils/trace_test.js → existing tests still pass with new inline: []
npx eslint bin/codeceptq.js lib/command/query.js test/runner/codeceptq_test.js → clean
Smoke against real examples/output/trace_*/*_page.html — finds all 17 inputs with line numbers
Reviewer: try npx codeceptq 'something' --file output/trace_*/<step>_page.html after a real test run

🤖 Generated with Claude Code

Adds `codeceptq`, a standalone CLI that takes an HTML stream (stdin or --file) plus a CodeceptJS locator (CSS / XPath / fuzzy / semantic) and prints matched elements with line numbers and outerHTML snippets. Designed to give AI agents a fast feedback loop against `aiTrace`'s per-step HTML snapshots: "would this locator match at step N?" without re-running the test or spawning a browser.

- Reuses Locator class for CSS→XPath conversion + semantic builders (--field, --click, --checkable, --select).
- Optional context arg scopes matches: `codeceptq 'Save' '.modal' --click`.
- Stable output flags: --limit, --snippet (default 500), --full, --json.
- Exit codes: 0 match, 1 no match, 2 invalid input/XPath.
- formatHtml now uses `inline: []` so every element gets its own line in trace HTML; line numbers map 1:1 to elements for codeceptq output.
- 45 runner tests against test/data/checkout.html, github.html, gitlab.html, drag_drop.html assert exact line + snippet for every locator strategy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

run_test, run_step_by_step, and pausedPayload now include aiTraceDir (the per-test output/trace_<title>_<hash>/ folder) so agents can point codeceptq directly at the saved *_page.html snapshots without globbing or recomputing the hash. Per-test entries in reporterJson.tests[] also carry the dir. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts: # bin/mcp-server.js

The 'Sign up' --click case on github.html (2k-line fixture, 12-branch semantic union XPath) takes ~8s locally and exceeds the default 10s mocha timeout on slower CI runners. Suite-level timeout matches what the local runs already use. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@id

Locator.clickable.wide and field.labelContains emit predicates of the form

    [@aria-labelledby = //*[@id][normalize-space(string(.)) = 'X']/@id]

xpath@0.0.34 re-runs the inner //* scan once per outer element match, which is O(N²) on non-trivial docs. The 2k-line github fixture spent 8.5s in that single branch out of 12.

Pre-resolve the inner subquery once, then splice the resulting id (or a sentinel for no-match) back as a literal so the engine sees a flat attribute compare.

Github 'Sign up' --click: 9026ms → 276ms (~33×). Full runner suite: 14s → 6s.

Reverts the 30s describe-level timeout from the previous commit since the underlying perf issue is now fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
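
The pre-resolve step described in this commit could look something like the sketch below. It is an illustration under stated assumptions, not the commit's code: `INNER_ID_SUBQUERY`, `NO_MATCH_SENTINEL`, and `preResolve` are hypothetical names, the `findIdByText` callback stands in for a one-off document scan, and only the first matching id is spliced in (an XPath node-set comparison would accept any of several). Ids containing a single quote are not handled.

```javascript
// Hypothetical pattern for the inner subquery that the commit message describes.
const INNER_ID_SUBQUERY =
  /\/\/\*\[@id\]\[normalize-space\(string\(\.\)\)\s*=\s*'([^']*)'\]\/@id/g;

// Hypothetical sentinel: a literal that no real id is expected to equal.
const NO_MATCH_SENTINEL = '__no_matching_id__';

// Replace each inner subquery with a string literal resolved once up front.
// `findIdByText(text)` is an assumed callback returning an id or null/undefined.
function preResolve(xpath, findIdByText) {
  return xpath.replace(INNER_ID_SUBQUERY, (_whole, text) => {
    const id = findIdByText(text);
    return `'${id ?? NO_MATCH_SENTINEL}'`;
  });
}
```

After the rewrite the engine only compares an attribute against a string literal, so the per-element rescan disappears.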

@for

Replaces the post-hoc regex pre-resolver with strategy-level construction.

Each semantic locator (--click/--field/--checkable) is built as a list of XPath branches; doc-wide subqueries (label[@for] resolution, ids by visible text) are evaluated once and inlined as literal predicates instead of sitting nested inside outer per-element predicates that the engine re-executes on every match.

Eval loop runs each branch separately and sorts results by source offset to preserve the document-order contract of XPath unions.

Github 'Sign up' --click: 9000ms → 264ms (independent of XPath engine: fontoxpath benched the same as xpath@0.0.34 on the original union). All 45 runner tests pass with identical line/snippet output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
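
The per-branch evaluation loop can be pictured with the following sketch. It assumes each result node exposes a numeric `offset` (a hypothetical property; in practice it would have to be derived from the parser's source locations) and that `evaluate` is a caller-supplied function running one XPath branch. `evalBranches` is an illustrative name, not the function in lib/command/query.js.

```javascript
// Run each branch on its own, drop duplicates, and restore document order.
// `evaluate(branch)` is an assumed callback returning an array of nodes;
// `node.offset` is a hypothetical numeric source offset.
function evalBranches(branches, evaluate) {
  const seen = new Set();
  const merged = [];
  for (const branch of branches) {
    for (const node of evaluate(branch)) {
      if (seen.has(node)) continue; // a union never repeats a node
      seen.add(node);
      merged.push(node);
    }
  }
  // A union returns nodes in document order, so sort by source position.
  return merged.sort((a, b) => a.offset - b.offset);
}
```

Deduplicating by node identity and sorting by offset is what keeps the output identical to a single `a | b | c` union expression.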

@id

…cate

The wide clickable / labelContains field XPath includes:

    .//*[@aria-labelledby = //*[@id][normalize-space(string(.)) = X]/@id]

That predicate forces every element to evaluate the inner //*[@id] subquery, which is O(N²) on any non-trivial document for pure-JS XPath engines (xpath npm: 7641ms on a 2k-line page; fontoxpath: 7057ms on the same branch). Browser engines optimize via join-pushdown.

Adding [@aria-labelledby] as a left-to-right filter predicate first cuts the slow comparison to only elements that actually have the attribute:

    .//*[@aria-labelledby][@aria-labelledby = //*[@id][...]/@id]

7641ms → 52ms (147×).

Semantics are identical: for non-positional predicates like these, [A][B] and [A and B] produce the same result set, but predicates are evaluated left to right, so the cheap attribute-existence check filters out the bulk first.

This is a one-predicate XPath change: codeceptq goes from 9000ms → 325ms on test/data/github.html with no special-case code.

Reverted the per-strategy reimplementation in lib/command/query.js (back to using Locator.clickable.wide / Locator.field.byText directly). Added two unit tests for the aria-labelledby branch in Locator.clickable.wide (positive + negative).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
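
To see why the extra predicate helps, here is a toy model of a naive engine in plain JavaScript. It is not an XPath implementation: elements are plain objects with hypothetical `id`, `text`, and `ariaLabelledby` fields, and `matchLabelledBy` simply counts how often the expensive inner scan runs with and without the cheap attribute-existence filter in front.

```javascript
// Toy model: each call to idsWithText() stands for one full //*[@id][...] scan.
// All field and function names here are illustrative assumptions.
function matchLabelledBy(elements, labelText, { prefilter }) {
  let innerScans = 0;
  const idsWithText = () => {
    innerScans += 1;
    return elements
      .filter((el) => el.id && el.text === labelText)
      .map((el) => el.id);
  };
  // [@aria-labelledby] as the first predicate: keep only elements with the attribute.
  const candidates = prefilter
    ? elements.filter((el) => el.ariaLabelledby)
    : elements;
  // Second predicate: the expensive comparison, run once per surviving candidate.
  const matched = candidates.filter((el) =>
    idsWithText().includes(el.ariaLabelledby)
  );
  return { matched, innerScans };
}
```

With the prefilter on, the inner scan runs only for elements that carry aria-labelledby, while the matched set stays the same.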

DavertMik and others added 16 commits April 26, 2026 22:27

update docs

f572b8e

updated docs, added browser plugin

1ae964b

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

4326bcc

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

82e760f

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

a14f976

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

3adc3f1

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

867f7e3

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

a57d587

Merge branch '4.x' of github.com:codeceptjs/CodeceptJS into 4.x

af8de03

Merge remote-tracking branch 'origin/4.x' into feat/cq-parser

8ae2869

# Conflicts: # bin/mcp-server.js

DavertMik merged commit 4f0fa49 into 4.x May 5, 2026
10 checks passed

DavertMik deleted the feat/cq-parser branch May 5, 2026 22:25

DavertMik merged 16 commits into 4.x from feat/cq-parser
